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2. Statement of Problem 

Given a random variable x on a probability space P, let f(x) be the density 
function associated with x. Let 

r x 

F(x) = J f(s)ds (1) 

-CD 

be the cumulative density function associated with x. The problem is: 

Given a random sample of size n,{x , .... x^jcan the density function f(x) 
be approximated by a smooth function using this data? 

The first work done on tins problem is by Benova, Kendall, and Stevetanov [4] 
in a function space setting. They define an approximation to f(x) called a Histospline 
by a homeomorphism of the Hilbert Space of all histograms to a subspace of a 
Hilbert Space of smooth functions. In a recent paper by I J. Schoenberg [14] he 
reconstructs the Histospline in a simpler setting and forms his splinogram with the 
variation -diminishing property using B -splines. 

In this paper another approach is taken to approximate a histogram. 
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3. Splines 

A spline is a. mechanical device used by draftsmen to draw smooth curves. 

It consists of a piece of wood or plastic with lead weights placed on the points where 
it is desired that the spline pass through. These points are called knots. The 
differential equations for abending beam with weights was solved by Holladay in [11]. 
This was one of the most important papers in the development of spline functions. 

The solution was piecewise cubic polynomials which had the first and second derivatives 
equal at each knot. 

Actuarians have been using spline functions since the 1930's for smoothing life 
expectancy tables (see Greville [9] for a survey of the early work). Also, in the ship 
building industry they have been using these in moving weights around on beams (called 
lofting) to get the hull of a ship to match the design (see Berger et. al. [2]). The work 
that is the basis for most mathematical investigations of splines is I.J. Schoenberg’s 
work [13]. For B’-splines Greville [10] is the best reference. 

A spline of degree r with m knots 


x, £ x 0 <; ... s x 
1 2 m 


is a function s(x) 

(1) s(x): ft-*ft, where {real numbers] = ft 

s(x)i, s = P . ,a polynomial of degree r, j = 1, 2, .... m 

K5 y V rj 

(3) s(x) e cf 1 = {f e ft|/ r ^ is continuous] . 

bli 

Note that a spline of degree zero is a step function and a spline of degree one is a 
polygon. The advantage of using splines rather than polynomials to fit n data points 
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is a polynomial of degree n-.l or less is required while a spline of degree r.with r 
fixed, can be used and r «u . A very simple type of spline is the truncated 

power function; 

m .. ^ n 

.x , if x > 0 
m f 
x = j 

+ 0 , ifx^O 

Let S (x ..... x ) be the set of all splines of degree r with m knotes.An important 
r 1 • m 

theorem in the theory of splines is 

s(x) e S r (x 1 , . . . , x m ) 3 ! P r (x) (and) c such that 
m 

s(x) = P (x) + 2 c.(x-- x ) r , where P (x) is a polynomial of degree r. 

r i i J “i - 

i=i 

See Greville [9], But this form gives rise to ill-conditioned matrices when one actually 
solves for the c and P^(x) (see Schumaker[l5] ). We shall use the B -splines of Curry 
and Schoenberg [6]. For equally spaced knots of odd degree r = 2k-l, where k is 
a natural number, they are given by 

k . , 2k 

e r (x) = ~ r. (-irWfl -x)+ ( 3 ) 

j=-k 

using the form of Rosen [12], For k = 2i.e. for cubic B-splines the graph of P r (x) is 
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For this paper, we shall have k = 2. The function. g_^(x) is symmetric about x = 0, 
bell-shaped and non-negative on the interval [-k, k]. It is identically zero off its support 
[-k, k]. The properties of £^(x) are 


(a) 

P (x) >0 , 0 < |x| < k 

( 4 ) 

(b) 

^ r (x) - P r (-x) , 

( 5 ) 

(c) 

p (0) > p (x) for x ^ 0 » 

(6) 

(d) 

CO k_1 

E |3(i)|= E p(i)=l e (x)dx=l 

* ^ i -f ^ ^ 

(7) 


* -*• . -f 1 -L v 

l=»CO 1=1 -k 


It is obvious that these properties make the B -spline a natural candidate for a basis 
for a probability density function f(x) such that 


(a) f(x) a 0 


( 8 ) 


(b) 


f(x) dx = 


1 


( 9 ) 


What is desired is a function f (x) that satisfies (8) and (9) and is a good 
approximation to f(x). Let f (x) be this approximating function 

ci 

m 

f (x) - S a.cp.(x) 
a j-i 1 ] 

where the cp. are B- splines. 
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4. Algorithm for Histogram 

The algorithm for the histogram goes as follows. The "n" observations are 
taken from a probability distribution and ordered such that 


x, 5X. ... ^ x <: x . (10) 

1 2 • n-1 n 

Note that if the sample is taken from a continuous distribution then restricted inequality 
may be placed between the data points. The next step is to determine where the knots 
are to be placed. In this paper they are placed at the integers between the data points 
and the first integer greater than and the first integer less than x^. Number these 
knots as follows 

x(l) <x(2) . . . <x(m) . (11) 


Next construct the histogram for the n observations on the points 


x. = [x(i) + x(i+l)]/2 . (i = 1 , 2, .... m-I) 


x = x_ - 1 , x = x , + 1 . 

0 1 m m-1 
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5, Linear Programming 

There is nothing novel about using linear programming for smoothing data. 

A method of finding the best line to fit data was used as early as 1820 by J.B. Fourier [8]. 
The simplex method of linear programming was introduced by G. Dantzig in the 40's 
[7], Perhaps the first structuring of a similar problem for linear programming was 
by A. Charnes, W.W. Cooper et. al. [5], The first use of linear programming for 
fitting data was by H.M. Wagner f 17]. Much work has bees done recently by Barrodale 
et. al. [1] in using linear programming for fitting data. For: unconstrained approximation 
function this is the most efficient algorithm devised. TheMsear programming 
formulation of this paper is after J.B. Rosen [12]. The main, general reference for 
linear programming in this paper is the book of A. Spivey and 1 R. M. Thrall [16], 

With the m points 


(x(i) , y.) 


m 

i=l 


obtained from the Histogram Algorithm as input make the following definitions: 


T 

y = 



y m> 


and 


cp.(x) = 3 (x - x(j)) t (j = 1, 2, m) . 

J ^ 


( 14 ) 


Let 



a 

m 


) 
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and 


cp (*) = (cpj^x) > cp 2 (x) , . . . , cp m Cx\0 , ] 


then the function that approximates these m data points is 


T m 

f (x) = a cp(x) = S a.cp.(x) . 

a i=l J J 

It is desired to have f (x) approximate the data points in the £ and l 

a 50 


H f a < x > - yll«“ max l f a < x i> ' y. I 

1 - 1-9 • * ♦ 9 


m 


H f a (x > “ y l'l = l f a (x i )_y i^ ' 

i=l 


( 15 ) 

. norm where: 

( 16 ) 


( 17 ) 
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6, l Norm 

CO 

Now to formulate the l norm as a linear programming problem let yeft 

CO 

and (17) is equivalent to 


Min{y|-v^f a (x.) - y^y, i = 1, 2, m} . 

a. V 


(18) 


If y is an optimal solution to (18) then 


y = llyx) - ylL ' 


otherwise a smaller value could be found. 
Now write (18) as 


m 


2 atp(x)-y i-y 
j=l J J 


(i - 1> 2* 1 1 m ^ ) > 


m 


-E a.cp.(x ) + y. ^ -Y 
. t yy i x 
J=1 


or 


m 


f £ am(x.)+Y^y. 

j=l J J 


(i = 1, 2, . . . , m) , 


m 


-E a cp (x ) + Ya-y' 
V. j=l J ) 1 


(19) 


Now (18) can be formulated as a linear programming problem as 


Min Y 

a 


(19) holds . 


( 20 ) 
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To put (20) in a matrix form make the follow definations 


F = 


W * * • Cp l (X m ) 


CD (X,) . . . CD (X ) 

% 1 un m 


J 


and F is a m x m matrix. 

Note tli at at this point we could solve the system of equations 


Fa = y (21) 

to obtain the coefficients fa ] but this would give no assurance that equations (8) 

i 1=1 

or (9) are satisfied by f (x). Let 

3 - 

co i/2 

L [a,b] = [f KJj f(x) l dx ) < " 3 • <22) 

It is well known that 

L 2 - - 

[x(l), x(m)] 

is a Hilbert Space and thus has a well defined inner product. One could obtain the 
projection of the Histogram , which is in L | (1) _ - (m)] , on the L 2 - (1)> - (m)] Actions 

which integrate to one. This would satisfy equations (9) but not necessarily equation (8). 
Continuing with the linear programming formulation let 
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_T . T T, 2m 
C =(y , -y )eR 


,„T . T , m+1 

W = (a , y) e R > 


T m+1 

b = (0, . . . , 0, 1) c R. , 


e^ = (1, . . . , 1) e ft” . 


Define A to be 


r t 

'F e 


A T =I 


T 

•F e 


(23) 


and A is a (2m) x (m+1) matrix. Requiring {.a. :> 0} (20) is 


Min b T W 


A T WaC 


(24) 


W s 0 . 


Since W a 0 the coefficients {a.}”^ are found to be positive and so 

m 

f (x)= S a cp (x) 5 0 (25) 

j=l J J 

and equation (8) is satisfied. We now proceed to show equation (9) can be satisfied 
by adding one constraint. Because 
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OO 

J cp.(x)dx=l (j = 3, . . . , m-2) 

-co J 

00 oo 

f cp-(x)dx = f cp <x)dx = ,95833 

00 CD 

f cp. (x)dx =| cp (x)dx= . 5 
J 1 J m 


(26) 


it follows that 

°° m ' m 

j f (x)dx = J E a cp (x)dx = £ a J cp.(x)dx = 

-CO -CO j = l i J j=l J 

m-2 

,5a. + ,95833a + E a. + ,95833a + . 5a 

1 2 . „ i m-1 m 


To satisfy (9) it is required that 


J 1 

-00 


so set 


d T W = 1 


(27) 


where 


d T = (0.5, 


.95833, l, 


1, . 95833, .5, 0) e R. 


m+1 


If equation (27) is satisfied then equation (9) is satisfied. The complete formulation 
of the problem is obtained by adding equation (27) to the constraints of equations (24). 
The .95803 and .5 arrise from the fact that the two end basis functions have 
support outside of [x(l), x(m)] . 
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7. 


JL Norm 


For the £ norm redefine y 


m 


Equation (17) becomes 
m 

Min { E v. I - y. s f (x.) - y. £ y , i = 1, 2, . . m} . 

. „ V i a l l i 
a, y 1=1 

* 

If y is the optimal solution to (28), then 


(28) 


m 


s Y , = l|f W -y|L • 


i=l 


But putting a subscript on the y in equations (19) the formulation of the constraints 
for equation (28) becomes 


m 


£ a cp (x. ) - y. s y. 

j=l J 


x *■ 1 , 2, « « • , m 


( 29 ) 


m 


-E acp (x.) - Y, = -V i 
)=1 


The complete l formulation as a linear programming problem is 
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3 


m 

Min T, y. 
i=l 1 

(29) holds . 


(30) 


By defining 



0 , 


e)cR 


2m 


and 



and 


„,T . T T. 

W * (a , y ) 


equation (24) is the matrix form of equation (30). Adding constraint (27) gives a 
formulation to obtain f (x) that satisfies equation (8) and (9). 

cl 
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8. Remarks 

T 

The B-spline basis functions used gives a sparce coefficient matrix A . 

The central B -splines are used even as the end splines with the support outside the 
interval [x(l), x(m)]. For equally spaced knots this presents no problem. For m = 7 
the F matrix is 


4 1 0 0 0 0 0* 

1 4 1 0 0 0 0 


F = 1/6 


0 

0 


1 

0 


4 

1 


10 0 0 
4 10 0 


0 0 0 1 4 1 0 


which corresponds to 


0 0 0 0 1 4 1 

0 0 0 0 0 1 4 

» 



x(l) x(2) x(3) x(4) x(5) x(6) x(7) 


This formulation has a diagonally dominant, symmetric matrix F, which has a nice 


closed form for F ^ given by 


where 


- 1 6 

F - — [a ] is the m x m matrix 
b ij 
m J 


b. .b . , if i = j 

l-l m -1 


(-l) i+ ^b. .b , if j>i 
l-l m-1 


a. . = 

ij 


a.. 

n 


, if j < i 


where b^ = 1 , b^ = 4 , b^ = ^b^_^ k = 2, 3, . . . , m. See [19]. 
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Tills of course makes solving equation (21) trivial* For more knots there will be a 

much larger percentage of zero coefficients as the size of the Jr* matrix increases, 
nr 

hence, the A matrix will be sparce for large m. The revised simplex algorithm 
leaves the zero entries of the initial trableau zero at each iteration. This is not 
true for the simplex algorithm. The algorithm of Barrodale uses the simplex algorithm. 
Because of its speed perhaps, it could be adapted to include the area matching constraint 
(27) and still be very fast. 

The l formulation of A is a (2m) x (2m) matrix so no computational speed 
can be expected by solving the dual of equations (24) and (27). Where as for the 
norm, A is a (m+1) x (2m) matrix and some computational improvement might occur 
from solving the dual system of equations (24) and (27).. 
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9. Algorithm for yx) 


(1) Choose F(x) = J f(s) ds 

-00 

(2) Generate random numbers Z^ in (0, 1), i = 1, 2, . . . , n. 

(3) For each i find F ^(Z^) = , i = 1, 2, . . « , n. 

(4) Order the {x.} n , in increasing order. 

v ’ i i=l 

(5) Choose knots (x(i)]^ . 

(6) Form a normalized histogram of the data with respect to the knots to 

obtain {x(i), • 

(7) Use the revised simplex algorithm for an ^ or t a fit to 

[x(i), y .)}“ with cubic B-splines as basis elements to obtain {a.}!^ and hence fjx). 

(8) Compare f(x) and f (x). 

3 . 


For raw data use steps (4) to (7) in the algorithm. 
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10. Example(Bliss Histogram [3]) 

This example was used so as to compare what is obtained by this algorithm 

with the Histospline [4 ] and the Splinogram. [14]. Let 



1 

f 1 ’ 

if x e [a,b] 


1! 

1 

1 °> 

Otherwise, 



L 


then the Histogram 

I-I is defined by 
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H = E h. y 
1=11 1 1 

[X., X 


where 





h = 1/578 

h = 104/578 

O 


h 2 = 5/578 

h = 66/578 


h * 20/578 

O 

h = 44/578 


h. = 38/578 
4 

h « 18/578 

'• 

h g = 50/578 

‘ h 12 = 10/578 


h r = 110/578 
0 

h 13 - 1/578 


h ? = 110/578 

h 14 ‘ 1/578 

and 


{x i = 

1+9 -<! 


This data is normally distributed. The raw data was not given so the algorithm had to 
start from the histogram, f (x) for this example is unimodal. 

cl 
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For the -t approximation to the data points, y = .04370 the obtained were 


a(l) = 

.00144 

a(8) = . m 59 

a(2) = 

.00457 

a(9) = ..«50 

a.(3 ) = 

.03130 

a(10) = .88-739 

a(4) = 

.07663 

a(ll) = 0.0- 

a(5) = 

.05337 

a(l2) = .01032 

a(6) = 

.22587 

a(1.3) = <LQ 

a(7) = 

.17713 

a(14) = t.0 


The graph of the function is in Figure 1, 

For the -t, approximation to the data point y = .042® 


a(l) = 

.00144 

a(8) * .20002 

a(2) = 

. 00457 

a(9) = .89334 

a(3) = 

.03130 

a(10) = .09334 

a<4) = 

. 07663 

a(U) = .01777 

a(5) = 

. 05337 

a(l2) = .02136 

a(6) = 

.22590 

a(l3) = 0. 0 


a(7) = .17702 


a(l4) = .00258 
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11. Conclusion 

The Histospline of [4] can go negative and is not unimodal, where as f a (x) 

cannot go negative and was unimodal for the example of the Bliss histogram. The 

splinogram of [14] was called to my attention after this algorithm was completed 

but not written up. The splinogram for the Bliss histogram appears smoother than 

f (xl in the l or L norm, but f (x) has one more continuous derivative than the 
a oo 1 a 

splinogram. Also, f (x) could by just changing k in the program have as many 

cl 

continuous derivatives as desired, however there would be a loss of the tridiagonal 
structure of F hence, computing the {a } x would take longer. The A matrix 
could be modified to where it had the area matching property at each point {x(i)}. =1 
like the Histospline and still satisfy equation (8). So this formulation is more versitle 
than either of the previous ones. Also, it is not necessary to require a nistogram 
in the algorithm and alternate methods that by pass this requirement could be substituted. 
The algorithm should be tested for recovering several probability density 


functions. 
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12. Appendix A. Test of Normal Distribution 

To investigate the algorithm for the normal distribution, normal random samples 
of size 300, 600, and 900 were generated using the algorithm of Moshman [18]. The 
algorithm of this paper is then used to obtain the approximating function for these 
sample sizes. The graphs of the approximating functions are compared with the 
graph of the normal distribution for sample sizes 300, 600, and 900 on pages 23, 

24, and 25, respectively. 

A bimodal normal distribution is also tested to see how well the algorithm 
distinguishes between the unimodal and bimodal functions. The results for this test 
is found on page 26. 
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